AOTInductor BERT CPP example #2931
Left some comments
cpp/build.sh
Outdated
echo -e "${COLOR_GREEN}[ INFO ] Cloning tokenizers-cpp repo ${COLOR_OFF}"
git clone https://github.com/mlc-ai/tokenizers-cpp.git "$TOKENIZERS_CPP_SRC_DIR"
cd $TOKENIZERS_CPP_SRC_DIR
git submodule update --init --recursive
Better to create a submodule in third-party directly instead of cloning it manually. That way we pin a specific commit and nothing breaks if the tokenizers-cpp repo gets updated. See llama2.so for an example. git submodule update --init --recursive is executed in build.sh for all our submodules.
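A minimal sketch of what the suggestion could look like, assuming the dependency is vendored as a git submodule and checked out at one fixed commit. The helper name `add_pinned_submodule`, the target path in the usage note, and the commit placeholder are illustrative assumptions, not taken from the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of build.sh): add a repository as a git
# submodule and pin it to a specific commit, so that later upstream changes
# do not affect the build.
add_pinned_submodule() {
  local url="$1"   # repository to vendor
  local path="$2"  # where the submodule lives inside the superproject
  local rev="$3"   # commit (or tag) to pin

  # Register the submodule and clone it into place.
  git submodule add "$url" "$path"
  # Move the submodule checkout to the pinned revision.
  git -C "$path" checkout --quiet "$rev"
  # Stage .gitmodules and the gitlink that records the pinned commit.
  git add .gitmodules "$path"
}
```

Run once from the repository root, for example as `add_pinned_submodule https://github.com/mlc-ai/tokenizers-cpp.git <path-under-third-party> <commit-sha>`, and then commit the result. After that, the `git submodule update --init --recursive` call in build.sh checks out the recorded commit, so the manual clone step is no longer needed.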
PYTHONPATH=${LLAMA_SO_DIR}:${PYTHONPATH} python ${BASE_DIR}/../examples/cpp/aot_inductor/llama2/compile.py --checkpoint ${HANDLER_DIR}/stories15M.pt ${HANDLER_DIR}/stories15M.so
fi
if [ ! -f "${EX_DIR}/aot_inductor/bert_handler/bert-seq.so" ]; then
pip install transformers
This will better fit into the cpp section of ts_scripts/install_dependencies.py
)

set_seed(1)
# PT2.2 has limitation on the max
Can you link the issue or PR here that describes the problem in 2.2?
NEW_DIR,
)

model.save_pretrained(NEW_DIR)
Why are we saving the model here? It's already in the hub cache, so there is no need to save it again.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = model.to(device=device)
dummy_input = "This is a dummy input for torch jit trace"
To avoid being misleading, we should change this to "... for torch export".
#include <folly/FileUtil.h>
#include <folly/dynamic.h>
#include <folly/json.h>
#include <fmt/format.h>
See #2944 (comment) regarding header placement
return data;
}

std::unique_ptr<folly::dynamic> BertCppHandler::LoadJsonFile(const std::string& file_path) {
We should move these into ts_utils
LGTM. Let's just check whether we need kineto for Mac, and if not, let's remove it.
@@ -394,10 +370,9 @@ cd $BASE_DIR
git submodule update --init --recursive

install_folly
install_kineto
#install_kineto
Can you check whether we actually need this for the Mac install? Let's remove it if it's not necessary.
* fix compile error on mac x86
* update install libtorch
* fmt
* fmt
* fmt
* Set return type of bert model and dynamic shapes
* fix json value
* fix build on linux
* add linux dependency
* replace sentenepice with tokenizers-cpp
* update dependency
* add attention mask
* fix compile error
* fix compile error
* fmt
* Fmt
* tockenizer-cpp git submodule
* update handler
* fmt
* fmt
* fmt
* unset env
* fix path
* Fix type error in bert aot example
* fmt
* fmt
* update max setting
* fix lint
* add limitation
* pinned folly to v2024.02.19.00
* pinned yam-cpp with tags/0.8.0
* pinned yaml-cpp 0.8.0
* update build.sh
* pinned yaml-cpp v0.8.0
* fmt
* fix typo
* add submodule kineto
* fmt
* fix workflow
* fix workflow
* fix ubuntu version
* update readme

---------

Co-authored-by: Matthias Reso <13337103+mreso@users.noreply.github.com>